"TalkPrinting": Improving Speaker Recognition by Modeling Stylistic Features

Authors

  • Sachin S. Kajarekar
  • M. Kemal Sönmez
  • Luciana Ferrer
  • Venkata Ramana Rao Gadde
  • Anand Venkataraman
  • Elizabeth Shriberg
  • Andreas Stolcke
  • Harry Bratt
Abstract

Automatic speaker recognition is an important technology for intelligence gathering, law enforcement, and audio mining. Conventional speaker recognition systems, which are based on independent short-term spectral samples, suffer from a lack of noise robustness and are unable to model a speaker’s idiosyncratic stylistic features. This paper describes “TalkPrinting”, a program of research aimed at adding such stylistic features to conventional systems. Results on three preliminary systems based on stylistic features demonstrate that (1) the new features alone carry significant speaker information; (2) they also carry significant complementary information compared to the conventional features; and (3) they provide increasing improvements in performance with increasing test durations.

Similar resources

Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continually improving, there is a considerable gap between their accuracy and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Modulation features for noise robust speaker identification

Current state-of-the-art speaker identification (SID) systems perform exceptionally well under clean conditions, but their performance deteriorates when noise and channel degradations are introduced. The literature has mostly focused on robust modeling techniques to combat degradations due to background noise and/or channel effects, and has demonstrated significant improvement in SID performance i...

Effect of Gender on Improving Speech Recognition System

Speech is the output of a time varying excitation excited by a time varying system. It generates pulses with fundamental frequency F0. This time varying impulse trained as one of the features, characterized by fundamental frequency F0 and its formant frequencies. These features vary from one speaker to another speaker and from gender to gender also. In this paper the effect of gender on improving...

ASR Dependent Techniques for Speaker Recognition

This thesis is concerned with improving the performance of speaker recognition systems in three areas: speaker modeling, verification score computation, and feature extraction in telephone quality speech. We first seek to improve upon traditional modeling approaches for speaker recognition, which are based on Gaussian Mixture Models (GMMs) trained globally over all speech from a given speaker. ...

Publication date: 2003